Abstract

We investigate how well the graph of a bilinear function $b : [0,1]^n \to \mathbb{R}$ can be
approximated by its McCormick relaxation. In particular, we are interested in the
smallest number $c$ such that the difference between the concave upper bounding and
convex lower bounding functions obtained from the McCormick relaxation approach is
at most $c$ times the difference between the concave and convex envelopes. Answering
a question of Luedtke, Namazifar and Linderoth, we show that this factor $c$ cannot be
bounded by a constant independent of $n$. More precisely, we show that for a random
bilinear function $b$ we have asymptotically almost surely $c \ge \sqrt{n}/4$. On the other hand,
we prove that $c \le 600\sqrt{n}$, which improves the linear upper bound proved by Luedtke,
Namazifar and Linderoth. In addition, we present an alternative proof for a result of
Misener, Smadbeck and Floudas characterizing functions $b$ for which the McCormick
relaxation is equal to the convex hull.

An important technique in global optimization is the construction of convex envelopes for 
nonconvex functions over convex sets (see for instance [9]), and consequently, there has been 
a lot of work on such envelopes of special classes of functions [1, 5, 15, 17, 19]. Many modern 
global optimization solvers [2, 16, 18] follow a general approach, proposed by McCormick [11], 
that is based on a linear relaxation for bilinear terms. Luedtke, Namazifar, and Linderoth [10] 
proved a number of statements about the strength of the resulting relaxations for multilinear 
functions. In this note we extend their results on bilinear functions. In particular, we 
characterize the bilinear functions for which the McCormick relaxation describes the convex 
hull, we improve the upper bound on this approximation ratio, and we prove that our new 
bound is asymptotically tight, thus providing a negative answer to a question from [10]. 

Consider a bilinear function $b : [0,1]^n \to \mathbb{R}$ given by

\[ b(x) = \sum_{ij \in E} a_{ij} x_i x_j \]

with coefficients $a_{ij} \in \mathbb{R}$, where $G = (V, E)$ is an undirected graph with vertex set
$V = \{1, \dots, n\}$, and we write $ij$ for $\{i, j\}$. The graph of $b$ is the set

\[ B = \{(x, z) \in [0,1]^n \times \mathbb{R} : z = b(x)\}, \]

and we are interested in relaxations of the convex hull of $B$, which can be characterized as
(see [15])

\[ \operatorname{conv}(B) = \Big\{ (x, z) \in [0,1]^n \times \mathbb{R} : \exists \lambda \in \Delta_{2^n} \text{ with } x = \sum_{k=1}^{2^n} \lambda_k x^k,\ z = \sum_{k=1}^{2^n} \lambda_k b(x^k) \Big\}, \]

where $x^1, \dots, x^{2^n}$ are the vertices of $[0,1]^n$ and
$\Delta_{2^n} = \{\lambda \in [0,1]^{2^n} : \sum_{k=1}^{2^n} \lambda_k = 1\}$ is the
$(2^n - 1)$-simplex. The McCormick relaxation [11] approximates $B$ by introducing for each
bilinear term $x_i x_j$ a new variable $y_{ij}$ together with the constraints
$0 \le y_{ij} \le x_i$, $y_{ij} \le x_j$ and $y_{ij} \ge x_i + x_j - 1$. More precisely, we define two
convex polytopes $P = P(G) \subseteq \mathbb{R}^{n+|E|}$ and $Q = Q(b) \subseteq \mathbb{R}^{n+1}$:

\[ P = \{(x, y) \in [0,1]^n \times [0,1]^{|E|} : y_{ij} \le x_i,\ y_{ij} \le x_j,\ y_{ij} \ge x_i + x_j - 1 \text{ for all } ij \in E\}, \]

and

\[ Q = \Big\{ (x, z) \in [0,1]^n \times \mathbb{R} : \exists y \in [0,1]^{|E|} \text{ with } (x, y) \in P \text{ and } z = \sum_{ij \in E} a_{ij} y_{ij} \Big\}. \]



1 Main results 


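We have $\operatorname{conv}(B) \subseteq Q$ and it is natural to ask how well $Q$ approximates
$\operatorname{conv}(B)$. Following the notation from [10] we denote the concave and convex
envelopes of the graph of $b$ by $\operatorname{cav}[b]$ and $\operatorname{vex}[b]$, respectively, and
the corresponding upper and lower McCormick envelopes by $\operatorname{mcu}[b]$ and
$\operatorname{mcl}[b]$, respectively. These envelopes are functions from $[0,1]^n$ to $\mathbb{R}$
defined by

\[ \operatorname{cav}[b](x) = \max\{z : (x, z) \in \operatorname{conv}(B)\}, \qquad \operatorname{vex}[b](x) = \min\{z : (x, z) \in \operatorname{conv}(B)\}, \]

\[ \operatorname{mcu}[b](x) = \max\{z : (x, z) \in Q\}, \qquad \operatorname{mcl}[b](x) = \min\{z : (x, z) \in Q\}. \]

We call the corresponding differences convex hull gap, denoted by $\operatorname{chgap}[b]$, and
McCormick gap, denoted by $\operatorname{mcgap}[b]$, respectively. In other words,

\[ \operatorname{chgap}[b](x) = \operatorname{cav}[b](x) - \operatorname{vex}[b](x) \quad \text{and} \quad \operatorname{mcgap}[b](x) = \operatorname{mcu}[b](x) - \operatorname{mcl}[b](x). \]

Our measure for the quality of $Q$ as an approximation of $\operatorname{conv}(B)$ is the number

\[ c^*(b) = \inf\{c \in \mathbb{R} : \operatorname{mcgap}[b](x) \le c \operatorname{chgap}[b](x) \text{ for all } x \in [0,1]^n\}. \]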
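As an illustration of the McCormick envelopes, the following Python sketch evaluates $\operatorname{mcl}[b](x)$ and $\operatorname{mcu}[b](x)$ at a given point. It relies on the observation that, for fixed $x$, the constraints defining $P$ restrict each $y_{ij}$ separately to the interval $[\max(0, x_i + x_j - 1), \min(x_i, x_j)]$, so the extreme values of $z$ can be computed edge by edge. The function names, the 0-based vertex indices and the dictionary representation of the coefficients are assumptions made for this example only.

```python
from typing import Dict, List, Tuple

Edge = Tuple[int, int]


def mccormick_bounds(a: Dict[Edge, float], x: List[float]) -> Tuple[float, float]:
    """Return (mcl, mcu) at the point x; vertices are indexed from 0."""
    lower = upper = 0.0
    for (i, j), coeff in a.items():
        y_min = max(0.0, x[i] + x[j] - 1.0)  # y_ij >= 0 and y_ij >= x_i + x_j - 1
        y_max = min(x[i], x[j])              # y_ij <= x_i and y_ij <= x_j
        # A positive coefficient is minimised at y_min, a negative one at y_max.
        lower += coeff * (y_min if coeff > 0 else y_max)
        upper += coeff * (y_max if coeff > 0 else y_min)
    return lower, upper


def mcgap(a: Dict[Edge, float], x: List[float]) -> float:
    """McCormick gap mcu(x) - mcl(x)."""
    lower, upper = mccormick_bounds(a, x)
    return upper - lower
```

At a vertex of the cube every interval collapses to a single point, so the gap is zero there; at $x = (1/2, \dots, 1/2)$ every interval is $[0, 1/2]$.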


In [10] it is proved that under the condition that all nonzero coefficients are positive we have

\[ c^*(b) \le 2 - \frac{1}{\lceil \chi(G)/2 \rceil}, \]

where $\chi(G)$ is the chromatic number of the graph $G$. For arbitrary coefficients, the much
weaker bound $c^*(b) \le n$ is established, and it is left as an open question if $c^*(b)$ can be
bounded by a constant independent of $n$ in the general case. We provide a negative answer
to this question by proving the following theorem.
Theorem 1. Let $G = (V, E)$ be the complete graph on the vertex set $V = \{1, \dots, n\}$, and
let $b(x) = \sum_{ij \in E} a_{ij} x_i x_j$ where the coefficients $a_{ij}$ are chosen independently and
uniformly at random from $\{1, -1\}$. For $x = (1/2, 1/2, \dots, 1/2)$ we have

\[ \lim_{n \to \infty} P\left( \operatorname{mcgap}[b](x) \ge \frac{\sqrt{n}}{4} \operatorname{chgap}[b](x) \right) = 1. \]

Moreover, we show that $\sqrt{n}$ is the correct leading term for the asymptotics.

Theorem 2. For every bilinear function $b : [0,1]^n \to \mathbb{R}$ and every $x \in [0,1]^n$,

\[ \operatorname{mcgap}[b](x) \le 600\sqrt{n} \operatorname{chgap}[b](x). \]

In order to prove Theorem 2 we establish the following discrepancy result which might
be of independent interest.

Theorem 3. Let $G = (V, E)$ be the complete graph on the vertex set $V = \{1, \dots, n\}$, and
let $a = (a_{ij}) \in \mathbb{R}^{n(n-1)/2}$ be a vector of edge weights. There exists a set $U \subseteq V$ such that

\[ \Big| \sum_{ij \in \delta(U)} a_{ij} \Big| \ge \frac{1}{600\sqrt{n}} \sum_{ij \in E} |a_{ij}|, \]

where the sum on the LHS is over the set $\delta(U) \subseteq E$ of edges with exactly one vertex in $U$.

Finally, we give a characterization of the functions $b$ with $Q = \operatorname{conv}(B)$. Let us call
an edge $ij \in E$ positive if $a_{ij} > 0$ and negative if $a_{ij} < 0$. Without loss of generality we
assume that $a_{ij} \ne 0$ for all $ij \in E$, so every edge is either positive or negative. The following
theorem is a direct consequence of Theorem 3.10 in [12] which states that the McCormick
inequalities are sufficient to describe the convex envelope of the graph of $b$ if and only if the
number of positive edges in every cycle is even. In order to capture the concave envelope as
well we just need to ensure that every cycle also contains an even number of negative edges.

Theorem 4. We have $Q = \operatorname{conv}(B)$ if and only if every cycle in $G$ has an even number of
positive edges and an even number of negative edges.

As a consequence, we can have $Q = \operatorname{conv}(B)$ only if $G$ is bipartite. Moreover, if $G$ is a
forest then $Q = \operatorname{conv}(B)$ for every choice of the coefficients $a_{ij}$, but as soon as $G$ contains a
cycle we can write down coefficients such that $Q \ne \operatorname{conv}(B)$.
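The cycle condition of Theorem 4 can be tested without enumerating cycles. The sketch below uses the standard fact that an edge set $F$ meets every cycle in an even number of edges exactly when $F$ is a cut, i.e., when the vertices can be labelled with 0 and 1 such that an edge joins differently labelled vertices if and only if it belongs to $F$. The function names and the input format (0-based vertices, a dictionary of nonzero coefficients) are assumptions of this example.

```python
from collections import deque


def is_cut(n, edges, in_f):
    """Check whether the edges flagged by in_f form a cut delta(U) of the graph."""
    adj = [[] for _ in range(n)]
    for (i, j) in edges:
        parity = 1 if in_f[(i, j)] else 0
        adj[i].append((j, parity))
        adj[j].append((i, parity))
    side = [None] * n
    for start in range(n):
        if side[start] is not None:
            continue
        side[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v, parity in adj[u]:
                expected = side[u] ^ parity
                if side[v] is None:
                    side[v] = expected
                    queue.append(v)
                elif side[v] != expected:
                    return False  # some cycle meets F in an odd number of edges
    return True


def mccormick_is_exact(n, a):
    """Condition of Theorem 4, assuming every coefficient in `a` is nonzero."""
    edges = list(a)
    positive = {e: a[e] > 0 for e in edges}
    negative = {e: a[e] < 0 for e in edges}
    return is_cut(n, edges, positive) and is_cut(n, edges, negative)
```

For instance, a 4-cycle with two positive and two negative edges passes the test, while a 4-cycle with three positive edges and one negative edge does not, although both graphs are bipartite.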

Our proofs are based on the following ideas from [10]. For a vector $x \in [0,1]^n$, let
$T_f = T_f(x) \subseteq V$ be the set of indices of fractional values, i.e., $T_f = \{i \in V : 0 < x_i < 1\}$.
The proof of $\operatorname{mcgap}[b](x) \le n \operatorname{chgap}[b](x)$ for all $x \in [0,1]^n$ in [10] proceeds in 3 steps.

1. $x \in \{0, 1/2, 1\}^n \implies \operatorname{mcgap}[b](x) = \frac{1}{2} \sum_{ij \in E,\ i,j \in T_f} |a_{ij}|$.

2. $x \in \{0, 1/2, 1\}^n \implies \operatorname{chgap}[b](x) \ge \frac{1}{2|T_f|} \sum_{ij \in E,\ i,j \in T_f} |a_{ij}|$.

3. The function $c \operatorname{chgap}[b](x) - \operatorname{mcgap}[b](x)$ is minimized at some $x \in \{0, 1/2, 1\}^n$.

We will show that the argument for step 2 can be modified to provide a lower bound for
$\operatorname{chgap}[b](x)$ in terms of the difference between the maximum and the minimum cut in the
subgraph of $G$ induced by $T_f$. Theorem 1 then follows by applying the Chernoff inequality.
Theorem 3, and consequently Theorem 2, is proved using probabilistic arguments that have
been developed in the context of studying the discrepancy of graphs [3, 6, 7], and Theorem 4 is
a consequence of the observation that the difference between the maximum and the minimum
cut is equal to the sum of the absolute values of all weights if and only if the sets of positive
and negative edges form two cuts of the graph.


2 Proofs of the theorems 


2.1 Characterizing the convex hull gap in terms of cuts 

Let $G = (V, E)$ be a graph with vertex set $V = [n]$. We use the following notation from [10].

• For $X \subseteq V$, $\gamma(X)$ is the set of edges with both vertices in $X$.

• For $X \subseteq V$, $\delta(X)$ is the set of edges with exactly one vertex in $X$.

• For $X, Y \subseteq V$ with $X \cap Y = \emptyset$, $\delta(X, Y)$ is the set of edges with one vertex in $X$ and
one vertex in $Y$.

• For $i \in V$, $\mathcal{S}_i$ is the collection of vertex sets that contain $i$, i.e., $\mathcal{S}_i = \{W \subseteq V : i \in W\}$.

• For $Z \subseteq E$, we put $a(Z) = \sum_{ij \in Z} a_{ij}$.

We denote the maximum and the minimum weight of a cut in the subgraph induced by
$X \subseteq V$ by $\mu^+(X)$ and $\mu^-(X)$, i.e.,

\[ \mu^+(X) = \max\Big\{ \sum_{ij \in \delta(U_1, U_2)} a_{ij} : U_1 \cup U_2 = X,\ U_1 \cap U_2 = \emptyset \Big\}, \]

\[ \mu^-(X) = \min\Big\{ \sum_{ij \in \delta(U_1, U_2)} a_{ij} : U_1 \cup U_2 = X,\ U_1 \cap U_2 = \emptyset \Big\}. \]

We identify $\{0,1\}^n$ with the power set of $V$ in the natural way: $x \in \{0,1\}^n$ is identified
with the set $\{i : x_i = 1\}$. We start by establishing that the upper bound for $\operatorname{chgap}[b](x)$ in
terms of cuts in induced subgraphs of $G$, proved in [10] (Lemma 3.10), is tight.


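Lemma 1. Let $x \in \{0, 1/2, 1\}^n$ and put $T_1 = \{i \in V : x_i = 1\}$ and $T_f = \{i \in V : x_i = 1/2\}$. Then

\[ \operatorname{vex}[b](x) = a(\gamma(T_1)) + \tfrac{1}{2} a(\delta(T_1, T_f)) + \tfrac{1}{2} a(\gamma(T_f)) - \tfrac{1}{2} \mu^+(T_f), \tag{1} \]

\[ \operatorname{cav}[b](x) = a(\gamma(T_1)) + \tfrac{1}{2} a(\delta(T_1, T_f)) + \tfrac{1}{2} a(\gamma(T_f)) - \tfrac{1}{2} \mu^-(T_f), \tag{2} \]

\[ \operatorname{chgap}[b](x) = \tfrac{1}{2} \big( \mu^+(T_f) - \mu^-(T_f) \big). \tag{3} \]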
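Proof. We start by writing $\operatorname{vex}[b](x)$ as follows:

\[ \operatorname{vex}[b](x) = \min\Big\{ \sum_{X \subseteq T_f} \lambda_X a(\gamma(X \cup T_1)) : \sum_{X \subseteq T_f} \lambda_X = 1,\ \sum_{X \in \mathcal{S}_i} \lambda_X = 1/2\ \ \forall i \in T_f,\ \lambda \ge 0 \Big\}. \]

Now

\[ a(\gamma(X \cup T_1)) = a(\gamma(T_1)) + a(\delta(T_1, X)) + a(\gamma(X)), \]

and, for any $\lambda$ satisfying $\sum_{X \in \mathcal{S}_i} \lambda_X = 1/2$ for all $i \in T_f$, we have that

\[ \sum_{X \subseteq T_f} \lambda_X a(\delta(T_1, X)) = \sum_{X \subseteq T_f} \lambda_X \sum_{i \in X} \sum_{j \in T_1 :\, ij \in E} a_{ij} = \sum_{j \in T_1} \sum_{i \in T_f :\, ij \in E} \sum_{X \in \mathcal{S}_i} \lambda_X a_{ij} = \frac{1}{2} a(\delta(T_1, T_f)). \]

Thus

\[ \operatorname{vex}[b](x) = a(\gamma(T_1)) + \tfrac{1}{2} a(\delta(T_1, T_f)) + M, \]

where

\[ M = \min\Big\{ \sum_{X \subseteq T_f} \lambda_X a(\gamma(X)) : \sum_{X \subseteq T_f} \lambda_X = 1,\ \sum_{X \in \mathcal{S}_i} \lambda_X = 1/2\ \ \forall i \in T_f,\ \lambda \ge 0 \Big\}. \]

As in the proof of Lemma 3.10 in [10], we can set $\lambda_{U_1} = \lambda_{U_2} = 1/2$ for a maximum cut
$(U_1, U_2)$ in the subgraph induced by $T_f$, which yields

\[ M \le \tfrac{1}{2} \big[ a(\gamma(U_1)) + a(\gamma(U_2)) \big] = \tfrac{1}{2} \big[ a(\gamma(T_f)) - \mu^+(T_f) \big]. \]

In order to prove that this bound is tight, we look at the dual

\[ M = \max\Big\{ y + \frac{1}{2} \sum_{i \in T_f} z_i : y + \sum_{i \in X} z_i \le a(\gamma(X))\ \ \forall X \subseteq T_f \Big\}. \]

Setting $y = -\mu^+(T_f)/2$ and

\[ z_i = \frac{1}{2} \sum_{j \in T_f :\, ij \in E} a_{ij} \quad \text{for } i \in T_f \]

we get a feasible solution, because for every $X \subseteq T_f$ we have

\[ y + \sum_{i \in X} z_i = -\tfrac{1}{2} \mu^+(T_f) + \tfrac{1}{2} a(\delta(X, T_f \setminus X)) + a(\gamma(X)) \le a(\gamma(X)). \]

Since the objective value

\[ y + \frac{1}{2} \sum_{i \in T_f} z_i = -\tfrac{1}{2} \mu^+(T_f) + \frac{1}{4} \sum_{i \in T_f} \sum_{j \in T_f :\, ij \in E} a_{ij} = \tfrac{1}{2} \big( a(\gamma(T_f)) - \mu^+(T_f) \big) \]

is equal to the upper bound for $M$ we have proved that $M$ is equal to this value, and this
concludes the proof of (1). For (2) we use the same method to get

\[ \operatorname{cav}[b](x) = a(\gamma(T_1)) + \tfrac{1}{2} a(\delta(T_1, T_f)) + M', \]

where $M'$ is characterized by

\[ M' = \max\Big\{ \sum_{X \subseteq T_f} \lambda_X a(\gamma(X)) : \sum_{X \subseteq T_f} \lambda_X = 1,\ \sum_{X \in \mathcal{S}_i} \lambda_X = 1/2\ \ \forall i \in T_f,\ \lambda \ge 0 \Big\} \]

\[ \phantom{M'} = \min\Big\{ y + \frac{1}{2} \sum_{i \in T_f} z_i : y + \sum_{i \in X} z_i \ge a(\gamma(X))\ \ \forall X \subseteq T_f \Big\}. \]

Taking a minimum cut $(U_1', U_2')$ we get a primal solution $\lambda_{U_1'} = \lambda_{U_2'} = 1/2$ and a corresponding
dual solution $y = -\mu^-(T_f)/2$,

\[ z_i = \frac{1}{2} \sum_{j \in T_f :\, ij \in E} a_{ij} \quad \text{for } i \in T_f. \]

Finally, (3) follows by taking the difference of (1) and (2). □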
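The duality argument in this proof can be checked numerically on small instances. The following Python sketch treats the case $T_f = V$ (that is, $x = (1/2, \dots, 1/2)$): it finds a maximum cut by brute force, builds the dual solution $(y, z)$ from the proof, verifies dual feasibility for every $X \subseteq V$, and returns the primal and dual objective values. The function names are hypothetical, vertices are indexed from 0, and integer weights are assumed so that exact rational arithmetic can be used.

```python
from fractions import Fraction
from itertools import combinations


def subsets(n):
    """All subsets of {0, ..., n-1} as frozensets."""
    for r in range(n + 1):
        yield from (frozenset(c) for c in combinations(range(n), r))


def gamma_weight(a, X):
    """a(gamma(X)): total weight of edges with both ends in X."""
    return sum(w for (i, j), w in a.items() if i in X and j in X)


def cut_weight(a, U):
    """a(delta(U)): total weight of edges with exactly one end in U."""
    return sum(w for (i, j), w in a.items() if (i in U) != (j in U))


def check_lemma1_certificate(n, a):
    """Return (dual feasible?, primal value, dual value) for T_f = V."""
    V = frozenset(range(n))
    U1 = max(subsets(n), key=lambda U: cut_weight(a, U))  # a maximum cut
    mu_plus = cut_weight(a, U1)
    primal = Fraction(gamma_weight(a, U1) + gamma_weight(a, V - U1), 2)
    y = Fraction(-mu_plus, 2)
    z = [Fraction(sum(w for (i, j), w in a.items() if v in (i, j)), 2)
         for v in range(n)]
    feasible = all(y + sum(z[i] for i in X) <= gamma_weight(a, X)
                   for X in subsets(n))
    dual = y + sum(z) / 2
    return feasible, primal, dual
```

For the triangle with all coefficients equal to 1, the maximum cut has weight 2, and both the primal and the dual value equal $1/2 = \frac{1}{2}(a(\gamma(V)) - \mu^+(V))$.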

By Lemma 3.9 from [10], we have $\operatorname{mcgap}[b](x) = \frac{1}{2} \sum_{ij \in \gamma(T_f)} |a_{ij}|$ for all $x \in \{0, 1/2, 1\}^n$,
and using the convexity argument from the proof of Theorem 3.12 in [10] we get the following
corollary.

Corollary 1. Let $c$ be a number such that $\sum_{ij \in \gamma(X)} |a_{ij}| \le c \big( \mu^+(X) - \mu^-(X) \big)$ for all $X \subseteq V$.
Then for all $x \in [0,1]^n$, $\operatorname{mcgap}[b](x) \le c \operatorname{chgap}[b](x)$. □
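For small graphs, the smallest constant $c$ satisfying the hypothesis of Corollary 1 can be found by brute force over all vertex sets $X$ and all cuts of the induced subgraph. The sketch below is only meant as an illustration: the function names are assumptions, vertices are indexed from 0, and the running time is exponential in $n$.

```python
from itertools import combinations


def induced_cut_range(a, X):
    """Return (mu_minus, mu_plus) for the subgraph induced by the vertices in X."""
    X = list(X)
    inside = set(X)
    best_min = best_max = 0  # the trivial cut (U_1 empty) has weight 0
    for mask in range(1 << max(len(X) - 1, 0)):  # the last vertex of X stays in U_2
        U1 = {X[k] for k in range(len(X) - 1) if (mask >> k) & 1}
        w = sum(c for (i, j), c in a.items()
                if i in inside and j in inside and ((i in U1) != (j in U1)))
        best_min, best_max = min(best_min, w), max(best_max, w)
    return best_min, best_max


def corollary1_constant(n, a):
    """Smallest c with sum_{ij in gamma(X)} |a_ij| <= c (mu_plus(X) - mu_minus(X)) for all X."""
    c = 0.0
    for r in range(2, n + 1):
        for X in combinations(range(n), r):
            total = sum(abs(w) for (i, j), w in a.items() if i in X and j in X)
            if total == 0:
                continue  # no edges inside X, nothing to check
            mu_minus, mu_plus = induced_cut_range(a, X)
            c = max(c, total / (mu_plus - mu_minus))
    return c
```

For the triangle with unit coefficients this gives $c = 3/2$, which agrees with the bound $2 - 1/\lceil \chi(G)/2 \rceil$ from [10] for $\chi(G) = 3$; for a path with one positive and one negative edge it gives $c = 1$.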


2.2 The lower bound 


Proof of Theorem 1. Let $G = (V, E)$ be the complete graph on the vertex set $V = \{1, \dots, n\}$
and consider the bilinear function

\[ b(x) = \sum_{ij \in E} a_{ij} x_i x_j, \]

where the coefficients $a_{ij}$ are randomly chosen from $\{1, -1\}$ (independently and uniformly).
Using the Chernoff inequality and the fact that $|\delta(U_1, U_2)| \le n^2/4$ for every cut $(U_1, U_2)$, we
have that

\[ P\Big( \Big| \sum_{ij \in \delta(U_1, U_2)} a_{ij} \Big| > 0.6\, n^{3/2} \Big) < 2 e^{-0.72 n}. \]

Taking the union bound over all $2^{n-1}$ cuts gives

\[ P\Big( -0.6\, n^{3/2} \le \sum_{ij \in \delta(U_1, U_2)} a_{ij} \le 0.6\, n^{3/2} \text{ for all cuts } (U_1, U_2) \Big) \ge 1 - 2^n e^{-0.72 n}, \]

which tends to 1 as $n \to \infty$. So

\[ \lim_{n \to \infty} P\big( \mu^+(V) - \mu^-(V) \le 1.2\, n^{3/2} \big) = 1, \]

and consequently, by Lemma 1, for $x = (1/2, 1/2, \dots, 1/2)$ and $n \ge 3$, with probability tending to 1 as $n \to \infty$,

\[ \operatorname{mcgap}[b](x) = \frac{n(n-1)}{4} \ge \frac{\sqrt{n}}{4} \cdot 0.6\, n^{3/2} \ge \frac{\sqrt{n}}{4} \operatorname{chgap}[b](x). \] □
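The quantities in this proof are easy to explore numerically for small $n$. The Python sketch below evaluates the union bound term $2^n e^{-0.72 n}$, draws a random sign instance on the complete graph, and computes both gaps at $x = (1/2, \dots, 1/2)$ by brute force over all cuts, using Lemma 3.9 of [10] for the McCormick gap and equation (3) for the convex hull gap. All function names, the seeding scheme and the 0-based vertex indices are assumptions of this example, and the enumeration is only practical for small $n$.

```python
import math
import random


def union_bound_failure(n):
    """Upper bound 2^n * exp(-0.72 n) on the probability that some cut exceeds 0.6 n^(3/2)."""
    return 2.0 ** n * math.exp(-0.72 * n)


def random_sign_instance(n, seed=0):
    """Independent uniform coefficients from {1, -1} on the complete graph."""
    rng = random.Random(seed)
    return {(i, j): rng.choice((1, -1)) for i in range(n) for j in range(i + 1, n)}


def cut_range_complete(n, a):
    """Brute-force (mu_minus, mu_plus) over all 2^(n-1) cuts."""
    lo = hi = 0
    for mask in range(1 << (n - 1)):  # vertex n-1 is fixed on the second side
        side = [(mask >> i) & 1 for i in range(n - 1)] + [0]
        w = sum(c for (i, j), c in a.items() if side[i] != side[j])
        lo, hi = min(lo, w), max(hi, w)
    return lo, hi


def gaps_at_half(n, a):
    """Return (mcgap, chgap) at x = (1/2, ..., 1/2)."""
    lo, hi = cut_range_complete(n, a)
    mcgap = sum(abs(c) for c in a.values()) / 2  # Lemma 3.9 of [10]
    chgap = (hi - lo) / 2                        # Lemma 1, equation (3)
    return mcgap, chgap
```

Since $2^n e^{-0.72 n} = e^{-(0.72 - \ln 2) n}$ and $0.72 - \ln 2 \approx 0.027$, the failure bound decays slowly, so the guarantee of the proof only becomes meaningful for fairly large $n$.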

Theorem 1 ensures that there are many functions with a large ratio between the McCormick
gap and the convex hull gap. Next we construct an explicit example for every $n$.
We define a bilinear function $b : [0,1]^n \to \mathbb{R}$ as follows. Let $k = \lceil \log_2(n) \rceil$. With vertex
$i \in V = \{1, \dots, n\}$ we associate the vector $\mathbf{i} = (i_1, \dots, i_k) \in \{0,1\}^k$ of the digits of $i - 1$
in binary representation, i.e., $i - 1 = i_1 2^0 + i_2 2^1 + \dots + i_k 2^{k-1}$, and we put $a_{ij} = (-1)^{\langle \mathbf{i}, \mathbf{j} \rangle}$,
where $\langle \cdot, \cdot \rangle$ is the standard scalar product, $\langle \mathbf{i}, \mathbf{j} \rangle = i_1 j_1 + \dots + i_k j_k$. The following lemma
is a standard discrepancy result (see for instance Chapter 10 in [4]), but for convenience we
include the short proof.

Lemma 2. We have $\mu^+(V) \le n^{3/2}/\sqrt{2}$ and $\mu^-(V) \ge -n^{3/2}/\sqrt{2}$.

Proof. Let $H$ be the $2^k \times 2^k$ matrix with rows and columns indexed by binary strings of
length $k$ with $H_{ij} = (-1)^{\langle \mathbf{i}, \mathbf{j} \rangle}$. Then $H$ is a Hadamard matrix, i.e., $H^T H = 2^k I$ where $I$ is
the identity matrix of size $2^k \times 2^k$. Therefore, $\|Hv\|_2 \le 2^{k/2} \|v\|_2$ for every $v$. The vertices in
$V$ correspond to the first $n$ rows and columns of $H$, and therefore we can identify a subset
$U \subseteq V$ with a vector $u \in \{0,1\}^{2^k}$. For a cut $(U, V \setminus U)$, let $w$ be the vector corresponding
to $V \setminus U$. We can bound the weight of this cut by

\[ \Big| \sum_{ij \in \delta(U)} a_{ij} \Big| = \Big| \sum_{i \in U} \sum_{j \in V \setminus U} (-1)^{\langle \mathbf{i}, \mathbf{j} \rangle} \Big| = |u^T H w| \le \|u\|_2 \|Hw\|_2 \le 2^{k/2} \|u\|_2 \|w\|_2. \]

Now $(u_1 + \dots + u_{2^k}) + (w_1 + \dots + w_{2^k}) = n$, and the AM-GM inequality yields

\[ \|u\|_2 \|w\|_2 = \sqrt{(u_1 + \dots + u_{2^k})(w_1 + \dots + w_{2^k})} \le \frac{(u_1 + \dots + u_{2^k}) + (w_1 + \dots + w_{2^k})}{2} = n/2. \]

Consequently,

\[ \Big( \sum_{ij \in \delta(U)} a_{ij} \Big)^2 \le 2^k n^2/4 < n^3/2. \] □

From Lemmas 1 and 2 it follows that $\operatorname{chgap}[b](1/2, \dots, 1/2) \le n^{3/2}/\sqrt{2}$, and therefore

\[ \operatorname{mcgap}[b](1/2, \dots, 1/2) = \frac{n(n-1)}{4} \ge \frac{1}{2\sqrt{2}} \Big( \sqrt{n} - \frac{1}{\sqrt{n}} \Big) \operatorname{chgap}[b](1/2, \dots, 1/2). \]

So for $n \ge 18$ we have $\operatorname{mcgap}[b](1/2, \dots, 1/2) \ge \frac{\sqrt{n}}{3} \operatorname{chgap}[b](1/2, \dots, 1/2)$.
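The explicit construction is straightforward to reproduce. The following Python sketch builds the coefficients $a_{ij} = (-1)^{\langle \mathbf{i}, \mathbf{j} \rangle}$ from the binary digits of $i - 1$ and $j - 1$ and checks the bound of Lemma 2 by enumerating all cuts. The function names are hypothetical, and the brute-force check is only feasible for small $n$.

```python
import math


def hadamard_coefficients(n):
    """a_ij = (-1)^<i, j> for the binary digit vectors of i-1 and j-1 (vertices 1..n)."""
    a = {}
    for i in range(1, n + 1):
        for j in range(i + 1, n + 1):
            inner = bin((i - 1) & (j - 1)).count("1")  # number of common 1-digits
            a[(i, j)] = -1 if inner % 2 else 1
    return a


def cut_range(n, a):
    """Brute-force (mu_minus(V), mu_plus(V)) over all cuts of the complete graph."""
    lo = hi = 0
    for mask in range(1 << (n - 1)):  # vertex n is fixed on the second side
        side = {v: (mask >> (v - 1)) & 1 for v in range(1, n)}
        side[n] = 0
        w = sum(c for (i, j), c in a.items() if side[i] != side[j])
        lo, hi = min(lo, w), max(hi, w)
    return lo, hi


def lemma2_holds(n):
    """Check mu_plus(V) <= n^(3/2)/sqrt(2) and mu_minus(V) >= -n^(3/2)/sqrt(2)."""
    lo, hi = cut_range(n, hadamard_coefficients(n))
    bound = n ** 1.5 / math.sqrt(2)
    return hi <= bound + 1e-9 and lo >= -bound - 1e-9
```

The inner product modulo 2 is computed as the parity of the number of 1-bits in the bitwise AND of $i - 1$ and $j - 1$.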
2.3 The upper bound

The unit weight case of Theorem 3 has been proved in [7], and here we extend this argument
to the general case. We start with a partition $V = L \cup R$ such that

\[ \sum_{ij \in \delta(L, R)} |a_{ij}| \ge \frac{1}{2} \sum_{ij \in E} |a_{ij}|. \tag{4} \]

To see why such a partition exists, consider a random partition of the vertices into two subsets,
where each vertex is assigned to either subset with equal probability, independently of the
other vertices. Taking the edge weights to be $|a_{ij}|$, the expected value of the resulting cut is
$\frac{1}{2} \sum_{ij \in E} |a_{ij}|$. Therefore, there exists a specific partition $V = L \cup R$ which satisfies (4).
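The existence argument above is probabilistic, but a partition satisfying (4) can also be constructed deterministically. The sketch below uses a standard greedy derandomization, which is not the argument of this note: vertices are placed one at a time on the side opposite to the one they are more heavily connected to, so at least half of the weight towards the previously placed vertices ends up in the cut. The function name and the input format (0-based vertices, nonnegative weights $|a_{ij}|$ keyed by pairs) are assumptions of this example.

```python
def greedy_half_cut(n, weights):
    """Split {0, ..., n-1} into (L, R) so that the cut carries at least half the total weight.

    `weights` maps pairs (i, j) to the nonnegative numbers |a_ij|.
    """
    L, R = set(), set()
    for v in range(n):
        to_L = sum(w for (i, j), w in weights.items()
                   if (i == v and j in L) or (j == v and i in L))
        to_R = sum(w for (i, j), w in weights.items()
                   if (i == v and j in R) or (j == v and i in R))
        # Place v opposite the side it is more heavily connected to.
        (R if to_L >= to_R else L).add(v)
    return L, R
```

Every edge is accounted for exactly once, namely when its second endpoint is placed, which gives the factor $1/2$ in (4).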

Now we choose a random subset $S \subseteq L$ ($P(i \in S) = 1/2$ for every $i \in L$ and these events
are independent).


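Lemma 3. For every $j \in R$,

\[ P\Big( \Big| \sum_{i \in S} a_{ij} \Big| \ge \frac{1}{4} \Big( \sum_{i \in L} a_{ij}^2 \Big)^{1/2} \Big) \ge \frac{1}{24}. \]

Proof. Fix $j \in R$, and let $X_i$ for $i \in L$ be the random variable defined by $X_i = 1$ if $i \in S$
and $X_i = -1$ if $i \notin S$, so that

\[ \sum_{i \in S} a_{ij} = \frac{1}{2} \sum_{i \in L} a_{ij} + \frac{1}{2} \sum_{i \in L} a_{ij} X_i. \]

For $Z = \big( \sum_{i \in L} a_{ij} X_i \big)^2$, we have $E(Z) = \sum_{i \in L} a_{ij}^2$, and therefore

\[ P\Big( Z \ge \frac{1}{2} \sum_{i \in L} a_{ij}^2 \Big) \ge \frac{1}{4} \frac{\big( \sum_{i \in L} a_{ij}^2 \big)^2}{E(Z^2)} \tag{5} \]

by the Paley-Zygmund inequality. From the Khintchine inequality with the Haagerup
bounds [8, 13] it follows that

\[ E(Z^2) \le 3 \Big( \sum_{i \in L} a_{ij}^2 \Big)^2, \]

hence (5) implies

\[ P\Big( \Big| \sum_{i \in L} a_{ij} X_i \Big| \ge \frac{1}{\sqrt{2}} \Big( \sum_{i \in L} a_{ij}^2 \Big)^{1/2} \Big) \ge \frac{1}{12}. \]

Since the distribution of $\sum_{i \in L} a_{ij} X_i$ is symmetric about 0, this gives the implications

\[ \sum_{i \in L} a_{ij} \ge 0 \implies P\Big( \sum_{i \in S} a_{ij} \ge \frac{1}{2\sqrt{2}} \Big( \sum_{i \in L} a_{ij}^2 \Big)^{1/2} \Big) \ge \frac{1}{24}, \]

\[ \sum_{i \in L} a_{ij} < 0 \implies P\Big( \sum_{i \in S} a_{ij} \le -\frac{1}{2\sqrt{2}} \Big( \sum_{i \in L} a_{ij}^2 \Big)^{1/2} \Big) \ge \frac{1}{24}, \]

and thus concludes the proof of the lemma (using $1/4 < 1/(2\sqrt{2})$). □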
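For short weight vectors, the two probabilistic ingredients of this proof can be verified exactly by enumerating all sign patterns. The Python sketch below computes $P(Z \ge E(Z)/2)$ and the ratio $E(Z^2)/E(Z)^2$ for $Z = (\sum_i a_i X_i)^2$ with independent uniform signs $X_i$. The function names are hypothetical, integer weights are assumed so that the arithmetic is exact, and the constant 3 used in the comments is the classical fourth moment bound, since $E\big((\sum_i a_i X_i)^4\big) = 3(\sum_i a_i^2)^2 - 2\sum_i a_i^4$.

```python
from fractions import Fraction
from itertools import product


def sign_sum_tail(a):
    """Exact P(Z >= E(Z)/2) for Z = (sum_i a_i X_i)^2 with independent uniform signs X_i."""
    second_moment = sum(c * c for c in a)  # E(Z) = sum_i a_i^2
    hits = 0
    for signs in product((1, -1), repeat=len(a)):
        s = sum(c * x for c, x in zip(a, signs))
        if 2 * s * s >= second_moment:
            hits += 1
    return Fraction(hits, 2 ** len(a))


def fourth_moment_ratio(a):
    """Exact E(Z^2) / E(Z)^2, which is at most 3 for every nonzero vector a."""
    total = sum(Fraction(sum(c * x for c, x in zip(a, signs)) ** 4)
                for signs in product((1, -1), repeat=len(a)))
    return total / 2 ** len(a) / sum(c * c for c in a) ** 2
```

Combining the bound on the ratio with the Paley-Zygmund inequality gives a tail probability of at least $1/12$, and the values returned by `sign_sum_tail` are typically much larger than this worst-case guarantee.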
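From Lemma 3 and Cauchy-Schwarz we obtain

\[ E\Big( \sum_{j \in R} \Big| \sum_{i \in S} a_{ij} \Big| \Big) \ge \frac{1}{96} \sum_{j \in R} \Big( \sum_{i \in L} a_{ij}^2 \Big)^{1/2} \ge \frac{1}{96} \sum_{j \in R} \Big( |L|^{-1/2} \sum_{i \in L} |a_{ij}| \Big) \ge \frac{1}{96\sqrt{n}} \sum_{i \in L} \sum_{j \in R} |a_{ij}| \ge \frac{1}{200\sqrt{n}} \sum_{ij \in E} |a_{ij}|, \]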

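where the last inequality follows from (4). This implies that there exists a set $S \subseteq L$ with

\[ \sum_{j \in R} \Big| \sum_{i \in S} a_{ij} \Big| \ge \frac{1}{200\sqrt{n}} \sum_{ij \in E} |a_{ij}|. \tag{6} \]

Fix such a set $S$ and define the sets

\[ R^+ = \Big\{ j \in R : \sum_{i \in S} a_{ij} \ge 0 \Big\}, \qquad R^- = \Big\{ j \in R : \sum_{i \in S} a_{ij} < 0 \Big\}. \]

Then

\[ \sum_{j \in R} \Big| \sum_{i \in S} a_{ij} \Big| = \sum_{j \in R^+} \sum_{i \in S} a_{ij} - \sum_{j \in R^-} \sum_{i \in S} a_{ij}, \]

and it follows from (6) that

\[ \max\Big\{ \sum_{j \in R^+} \sum_{i \in S} a_{ij},\ -\sum_{j \in R^-} \sum_{i \in S} a_{ij} \Big\} \ge \frac{1}{400\sqrt{n}} \sum_{ij \in E} |a_{ij}|. \]

Without loss of generality, we assume that the maximum is obtained by the first term, i.e.,

\[ \sum_{j \in R^+} \sum_{i \in S} a_{ij} \ge \frac{1}{400\sqrt{n}} \sum_{ij \in E} |a_{ij}|. \]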

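We conclude the proof of Theorem 3 as suggested in [14]. Let $W = V \setminus (S \cup R^+)$ and
distinguish three cases.

Case 1. If $\sum_{ij \in \delta(S, W)} a_{ij} \ge -\frac{1}{1200\sqrt{n}} \sum_{ij \in E} |a_{ij}|$ then we can take $U = S$:

\[ \sum_{ij \in \delta(S)} a_{ij} = \sum_{ij \in \delta(S, R^+)} a_{ij} + \sum_{ij \in \delta(S, W)} a_{ij} \ge \Big( \frac{1}{400} - \frac{1}{1200} \Big) \frac{1}{\sqrt{n}} \sum_{ij \in E} |a_{ij}| = \frac{1}{600\sqrt{n}} \sum_{ij \in E} |a_{ij}|. \]

Case 2. If $\sum_{ij \in \delta(R^+, W)} a_{ij} \ge -\frac{1}{1200\sqrt{n}} \sum_{ij \in E} |a_{ij}|$ then we can take $U = R^+$:

\[ \sum_{ij \in \delta(R^+)} a_{ij} = \sum_{ij \in \delta(S, R^+)} a_{ij} + \sum_{ij \in \delta(R^+, W)} a_{ij} \ge \Big( \frac{1}{400} - \frac{1}{1200} \Big) \frac{1}{\sqrt{n}} \sum_{ij \in E} |a_{ij}| = \frac{1}{600\sqrt{n}} \sum_{ij \in E} |a_{ij}|. \]

Case 3. If $\max\Big\{ \sum_{ij \in \delta(S, W)} a_{ij},\ \sum_{ij \in \delta(R^+, W)} a_{ij} \Big\} < -\frac{1}{1200\sqrt{n}} \sum_{ij \in E} |a_{ij}|$ then we can take $U = S \cup R^+$:

\[ \sum_{ij \in \delta(S \cup R^+)} a_{ij} = \sum_{ij \in \delta(S, W)} a_{ij} + \sum_{ij \in \delta(R^+, W)} a_{ij} < -\frac{1}{600\sqrt{n}} \sum_{ij \in E} |a_{ij}|. \]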
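The three cases rest on the decompositions $\delta(S) = \delta(S, R^+) \cup \delta(S, W)$, $\delta(R^+) = \delta(S, R^+) \cup \delta(R^+, W)$ and $\delta(S \cup R^+) = \delta(S, W) \cup \delta(R^+, W)$, which together give $a(\delta(S)) + a(\delta(R^+)) - a(\delta(S \cup R^+)) = 2 a(\delta(S, R^+))$. Hence one of the three candidate sets has cut weight at least $\frac{2}{3} a(\delta(S, R^+))$ in absolute value, which matches the step from $1/400$ to $1/600$. The Python sketch below selects the best of the three candidates; the function names are assumptions of this example, and vertices may be any hashable labels.

```python
def cut_weight(a, U):
    """a(delta(U)): total weight of edges with exactly one endpoint in U."""
    return sum(w for (i, j), w in a.items() if (i in U) != (j in U))


def between(a, A, B):
    """a(delta(A, B)) for disjoint vertex sets A and B."""
    return sum(w for (i, j), w in a.items()
               if (i in A and j in B) or (i in B and j in A))


def best_of_three(a, S, R_plus):
    """Return the candidate U among S, R_plus and their union maximising |a(delta(U))|."""
    candidates = [set(S), set(R_plus), set(S) | set(R_plus)]
    return max(candidates, key=lambda U: abs(cut_weight(a, U)))
```

Taking the best candidate directly avoids the case distinction, at the cost of evaluating three cuts instead of testing two threshold conditions.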


Proof of Theorem 2. Applying Theorem 3 to the subgraph induced by a vertex set $X \subseteq V$
yields

\[ \mu^+(X) - \mu^-(X) \ge \frac{1}{600\sqrt{|X|}} \sum_{ij \in \gamma(X)} |a_{ij}| \ge \frac{1}{600\sqrt{n}} \sum_{ij \in \gamma(X)} |a_{ij}|, \]

and now Corollary 1 implies the statement of the theorem. □


2.4 Characterization of equality 

As mentioned in Section 1, Theorem 4 is a direct consequence of Theorem 3.10 in [12]. We 
include the following short proof in order to show how this result can be derived from the 
correspondence between the convex hull gap and the range of cut weights in the graph G. 

Proof of Theorem 4. Suppose that every cycle in $G$ has an even number of positive edges
and an even number of negative edges. Now let $X \subseteq V$ be any vertex set. We introduce two
equivalence relations, $\sim_1$ and $\sim_2$, on $X$. For the first, we put $i \sim_1 j$ if $G$ contains a path
between $i$ and $j$ consisting of positive edges. Similarly, we put $i \sim_2 j$ if $G$ contains a path
between $i$ and $j$ consisting of negative edges. Let $G_1$ and $G_2$ be the quotient graphs, i.e., the
vertices of $G_k$ ($k = 1, 2$) are the equivalence classes for $\sim_k$ and there is an edge between two
classes $[i]$ and $[j]$ in $G_k$ if there is an edge in $G$ between any element of $[i]$ and any element
of $[j]$. Note that the edges in $G_1$ correspond to negative edges of $G$, and the edges in $G_2$
correspond to positive edges of $G$. If every cycle in $G$ contains an even number of positive
and negative edges, then $G_1$ and $G_2$ are bipartite. The partition of $G_1$ induces a partition
$X = U_1 \cup U_2$ such that $\delta(U_1, U_2)$ is the set of negative edges in $\gamma(X)$, and the partition of
$G_2$ induces a partition $X = U_1' \cup U_2'$ such that $\delta(U_1', U_2')$ is the set of positive edges in $\gamma(X)$.
Consequently, $\mu^+(X) - \mu^-(X) = \sum_{ij \in \gamma(X)} |a_{ij}|$ and, since $X$ was chosen arbitrarily, it
follows, by Corollary 1, that $\operatorname{mcgap}[b](x) = \operatorname{chgap}[b](x)$ for all $x \in [0,1]^n$.

Conversely, suppose that there exists a cycle that has an odd number of negative edges.
Then any cut of $G$ that contains all negative edges in the graph, i.e., that contains the
set $E^- = \{ij \in E : a_{ij} < 0\}$, must contain at least one positive edge. This implies
$\mu^-(V) > \sum_{ij \in E^-} a_{ij}$. So

\[ \mu^+(V) - \mu^-(V) < \sum_{ij \in E} |a_{ij}|, \]

and consequently, by Lemma 1, $\operatorname{chgap}[b](1/2, \dots, 1/2) < \operatorname{mcgap}[b](1/2, \dots, 1/2)$. The
argument for a cycle with an odd number of positive edges is similar. □

Theorem 4 implies that for functions without negative coefficients we have $Q = \operatorname{conv}(B)$
if and only if $G$ is bipartite, where the "if"-part of this equivalence follows from Theorem
3.10 in [10]. In contrast, without restricting the signs of the coefficients bipartiteness does
not help. The probabilistic argument in the proof of Theorem 1 also works for the complete
bipartite graph with equal parts and yields that in this setting almost all functions $b$ with
coefficients in $\{1, -1\}$ have $\operatorname{mcgap}[b](x)$ exceeding $\operatorname{chgap}[b](x)$ by a factor of order $\sqrt{n}$
for $x = (1/2, \dots, 1/2)$.